116 research outputs found
Behavior Planning For Connected Autonomous Vehicles Using Feedback Deep Reinforcement Learning
With the development of communication technologies, connected autonomous
vehicles (CAVs) can share information with each other. We propose a novel
behavior planning method for CAVs that decides actions, such as whether to change
lanes or keep the current lane, based on each vehicle's own observations and the
information shared by its neighbors, while ensuring that corresponding control
maneuvers such as acceleration and steering angle exist to guarantee the safety
of each individual autonomous vehicle. We formulate this problem as a hybrid
partially observable Markov decision process (HPOMDP) that captures objectives
such as traffic flow efficiency and driving comfort, as well as safety requirements.
The discrete state transition is determined by the proposed feedback deep
Q-learning algorithm using the feedback action from an underlying controller
based on control barrier functions. The feedback deep Q-learning algorithm we
design aims to solve a critical challenge of reinforcement learning (RL) in
physical systems: guaranteeing safety while the RL agent is still exploring the
action space to increase its reward. We prove that our method
renders a forward invariant safe set for the continuous state physical dynamic
model of the system while the RL agent is learning. In experiments, our
behavior planning method increases traffic flow and driving comfort compared
with the intelligent driver model (IDM). We also validate that our method
maintains safety during the learning process.
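As a rough illustration of this feedback loop (not the paper's implementation), the sketch below lets a tabular Q-learner stand in for the deep Q-network and uses a toy one-dimensional car-following model: the agent proposes a discrete action, a control-barrier-function-style check replaces it with a safe feedback action when needed, and the Q-update is applied to the action that was actually executed. The dynamics, thresholds, and action set are illustrative assumptions.

```python
# Minimal sketch of the "feedback action" idea: RL proposes, a CBF-style
# one-step check filters, and learning uses the executed action.
from collections import defaultdict

ACTIONS = {0: -2.0, 1: 0.0, 2: +2.0}   # brake / keep speed / accelerate (m/s^2), assumed
DT, T_HEADWAY = 0.1, 1.5               # control step and desired time headway (s), assumed

def barrier(gap, v):
    """h(x) >= 0 encodes 'the gap is at least the time-headway distance'."""
    return gap - T_HEADWAY * v

def safe_feedback_action(gap, v, proposed):
    """Return the proposed action if it keeps h >= 0 one step ahead;
    otherwise fall back to the first safe action, scanning from hardest braking."""
    for a in [proposed] + sorted(ACTIONS, key=lambda k: ACTIONS[k]):
        v_next = max(0.0, v + ACTIONS[a] * DT)
        gap_next = gap - v_next * DT           # lead vehicle assumed stationary here
        if barrier(gap_next, v_next) >= 0.0:
            return a
    return 0                                   # hardest braking as a last resort

Q = defaultdict(lambda: [0.0, 0.0, 0.0])       # tabular stand-in for the deep Q-network

def q_update(s, a, r, s_next, alpha=0.1, gamma=0.99):
    """Standard Q-learning update, applied to the action that was actually executed."""
    Q[s][a] += alpha * (r + gamma * max(Q[s_next]) - Q[s][a])
```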
Spatial-Temporal-Aware Safe Multi-Agent Reinforcement Learning of Connected Autonomous Vehicles in Challenging Scenarios
Communication technologies enable coordination among connected and autonomous
vehicles (CAVs). However, it remains unclear how to utilize shared information
to improve the safety and efficiency of the CAV system. In this work, we
propose a framework of constrained multi-agent reinforcement learning (MARL)
with a parallel safety shield for CAVs in challenging driving scenarios. The
coordination mechanisms of the proposed MARL include information sharing and
cooperative policy learning, with Graph Convolutional Network (GCN)-Transformer
as a spatial-temporal encoder that enhances each agent's environmental awareness.
The safety shield module with Control Barrier Functions (CBF)-based safety
checking protects the agents from taking unsafe actions. We design a
constrained multi-agent advantage actor-critic (CMAA2C) algorithm to train safe
and cooperative policies for CAVs. With experiments deployed in the CARLA
simulator, we verify the effectiveness of the safety checking, the spatial-temporal
encoder, and the coordination mechanisms of our method through comparative
experiments in several challenging scenarios with defined hazard vehicles
(HAZV). Results show that our proposed methodology significantly increases
system safety and efficiency in challenging scenarios.
Comment: This paper has been accepted by the 2023 IEEE International
Conference on Robotics and Automation (ICRA 2023). 6 pages, 5 figures.
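To illustrate the constrained-policy-learning idea (this is a generic Lagrangian-style sketch, not the authors' CMAA2C code), the snippet below shows one actor update in PyTorch: the policy-gradient loss trades off the reward advantage against a safety-cost advantage weighted by a multiplier, and the multiplier is updated by dual ascent whenever the average episode cost exceeds a limit. All tensors, names, and hyperparameters are placeholders.

```python
import torch

def constrained_a2c_step(log_probs, reward_adv, cost_adv, cost_return,
                         lam, cost_limit=0.0, lam_lr=0.01):
    """One Lagrangian-style update: return the actor loss and the new multiplier."""
    # Policy-gradient loss with the safety-cost advantage weighted by lambda.
    actor_loss = -(log_probs * (reward_adv - lam * cost_adv)).mean()
    # Dual ascent: grow lambda while the average episode cost exceeds the limit.
    lam = max(0.0, lam + lam_lr * (cost_return - cost_limit))
    return actor_loss, lam

# Toy usage with random placeholder tensors for one batch of agent transitions.
log_probs = torch.randn(32, requires_grad=True)
reward_adv, cost_adv = torch.randn(32), torch.randn(32).abs()
loss, lam = constrained_a2c_step(log_probs, reward_adv, cost_adv,
                                 cost_return=0.4, lam=0.5, cost_limit=0.1)
loss.backward()   # in practice log_probs would come from the policy network
```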
What is the Solution for State-Adversarial Multi-Agent Reinforcement Learning?
Various types of Multi-Agent Reinforcement Learning (MARL) methods have been
developed, assuming that agents' policies are based on true states. Recent
works have improved the robustness of MARL under uncertainties from the reward,
transition probability, or other partners' policies. However, in real-world
multi-agent systems, state estimations may be perturbed by sensor measurement
noise or even adversaries. Agents' policies trained with only true state
information will deviate from optimal solutions when facing adversarial state
perturbations during execution. MARL under adversarial state perturbations has
received limited study. Hence, in this work, we propose a State-Adversarial Markov Game
(SAMG) and make the first attempt to study the fundamental properties of MARL
under state uncertainties. We prove that the optimal agent policy and the
robust Nash equilibrium do not always exist for an SAMG. Instead, we define a new
solution concept, the robust agent policy, for the proposed SAMG under adversarial
state perturbations, in which agents aim to maximize the worst-case expected
state value. We then design a gradient descent ascent-based robust MARL
algorithm to learn the robust policies for the MARL agents. Our experiments
show that adversarial state perturbations decrease agents' rewards for several
baselines from the existing literature, while our algorithm outperforms
baselines with state perturbations and significantly improves the robustness of
the MARL policies under state uncertainties.
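The gradient descent ascent (GDA) scheme can be sketched on a toy objective (an illustration of the general GDA pattern, not the paper's algorithm): an adversary perturbs the observed state inside an l-infinity ball to minimize the agent's objective, while the policy parameters are updated to maximize the resulting worst-case objective. The quadratic objective, step sizes, and ball radius below are assumptions.

```python
import torch

def objective(theta, state):
    # Stand-in for the expected state value under policy parameters theta.
    return -((theta - state) ** 2).sum()

theta = torch.zeros(2, requires_grad=True)       # "policy parameters"
true_state = torch.tensor([1.0, -1.0])
delta = torch.zeros(2, requires_grad=True)       # adversarial state perturbation
eps, eta = 0.3, 0.05                             # perturbation budget and step size

for _ in range(200):
    # Adversary: gradient descent on the objective w.r.t. the perturbation.
    loss_adv = objective(theta, true_state + delta)
    grad_delta, = torch.autograd.grad(loss_adv, delta)
    with torch.no_grad():
        delta -= eta * grad_delta
        delta.clamp_(-eps, eps)                  # stay inside the l_inf ball
    # Agent: gradient ascent on the objective under the worst-case state.
    loss_agent = objective(theta, true_state + delta)
    grad_theta, = torch.autograd.grad(loss_agent, theta)
    with torch.no_grad():
        theta += eta * grad_theta

print(theta.detach(), delta.detach())
```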
Shared Information-Based Safe And Efficient Behavior Planning For Connected Autonomous Vehicles
The recent advancements in wireless technology enable connected autonomous
vehicles (CAVs) to gather data via vehicle-to-vehicle (V2V) communication, such
as processed LIDAR and camera data from other vehicles. In this work, we design
an integrated information sharing and safe multi-agent reinforcement learning
(MARL) framework for CAVs, to take advantage of the extra information when
making decisions to improve traffic efficiency and safety. We first use
weight-pruned convolutional neural networks (CNNs) to process the raw image and
point cloud LIDAR data locally at each autonomous vehicle, and share the CNN
output data
with neighboring CAVs. We then design a safe actor-critic algorithm that
utilizes both a vehicle's local observation and the information received via
V2V communication to explore an efficient behavior planning policy with safety
guarantees. Using the CARLA simulator for experiments, we show that our
approach improves the CAV system's efficiency in terms of average velocity and
comfort under different CAV ratios and different traffic densities. We also
show that our approach avoids the execution of unsafe actions and always
maintains a safe distance from other vehicles. We construct an
obstacle-at-corner scenario to show that the shared vision can help CAVs to
observe obstacles earlier and take action to avoid traffic jams.
Comment: This paper received the Best Paper Award at the DCAA workshop of AAAI
202
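As a hedged sketch of how a weight-pruned CNN encoder could be obtained (the paper's own pruning recipe may differ), the snippet below applies simple magnitude pruning to the convolutional layers of a small placeholder encoder before its output features are shared over V2V.

```python
import torch
import torch.nn as nn

def magnitude_prune_(module: nn.Module, sparsity: float = 0.5) -> None:
    """In-place: zero the `sparsity` fraction of smallest-magnitude conv weights."""
    for layer in module.modules():
        if isinstance(layer, nn.Conv2d):
            w = layer.weight.data
            k = int(sparsity * w.numel())
            if k == 0:
                continue
            threshold = w.abs().flatten().kthvalue(k).values
            layer.weight.data = torch.where(w.abs() > threshold,
                                            w, torch.zeros_like(w))

# Placeholder encoder; the real per-vehicle networks and inputs will differ.
encoder = nn.Sequential(nn.Conv2d(3, 16, 3), nn.ReLU(), nn.Conv2d(16, 32, 3))
magnitude_prune_(encoder, sparsity=0.5)
features = encoder(torch.randn(1, 3, 64, 64))   # compact features to share via V2V
```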
Uncertainty Quantification of Collaborative Detection for Self-Driving
Sharing information between connected and autonomous vehicles (CAVs)
fundamentally improves the performance of collaborative object detection for
self-driving. However, CAVs still have uncertainty in object detection due to
practical challenges, which affects downstream modules in self-driving such
as planning and control. Hence, uncertainty quantification is crucial for
safety-critical systems such as CAVs. Our work is the first to estimate the
uncertainty of collaborative object detection. We propose a novel uncertainty
quantification method, called Double-M Quantification, which tailors a moving
block bootstrap (MBB) algorithm with direct modeling of the multivariate
Gaussian distribution of each corner of the bounding box. Our method captures
both the epistemic and aleatoric uncertainty with one inference
pass based on the offline Double-M training process, and it can be used with
different collaborative object detectors. Through experiments on the
comprehensive collaborative perception dataset, we show that our Double-M
method achieves more than a 4X improvement in uncertainty score and more than a 3%
accuracy improvement compared with state-of-the-art uncertainty
quantification methods. Our code is publicly available at
https://coperception.github.io/double-m-quantification.
Comment: 6 pages, 3 figures.
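A minimal sketch of the moving block bootstrap ingredient is shown below: it resamples contiguous blocks of per-frame corner residuals to estimate a covariance for one bounding-box corner. The residuals, block length, and 2-D corner model are illustrative assumptions and do not reproduce the Double-M training procedure.

```python
import numpy as np

rng = np.random.default_rng(0)
residuals = rng.normal(size=(200, 2))        # stand-in: (x, y) errors of one corner over time

def moving_block_bootstrap(data, block_len=10, n_boot=500):
    """Average the sample covariance over bootstrap resamples of contiguous blocks."""
    n = len(data)
    starts = np.arange(n - block_len + 1)
    covs = []
    for _ in range(n_boot):
        picks = rng.choice(starts, size=n // block_len, replace=True)
        sample = np.concatenate([data[s:s + block_len] for s in picks])
        covs.append(np.cov(sample, rowvar=False))
    return np.mean(covs, axis=0)

corner_cov = moving_block_bootstrap(residuals)
print(corner_cov)                             # 2x2 covariance estimate for this corner
```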
Multi-Agent Reinforcement Learning Guided by Signal Temporal Logic Specifications
Reward design is a key component of deep reinforcement learning, yet some
tasks and designers' objectives may be unnatural to express as a scalar cost
function. Among the various techniques, formal methods integrated with deep
reinforcement learning (DRL) have garnered considerable attention due to their
expressiveness and flexibility in defining rewards and requirements for
different states and actions of the
agent. However, how to leverage Signal Temporal Logic (STL) to guide
multi-agent reinforcement learning reward design remains unexplored. Complex
interactions, heterogeneous goals and critical safety requirements in
multi-agent systems make this problem even more challenging. In this paper, we
propose a novel STL-guided multi-agent reinforcement learning framework. The
STL requirements are designed to include both task specifications according to
the objective of each agent and safety specifications, and the robustness
values of the STL specifications are leveraged to generate rewards. We validate
the advantages of our method through empirical studies. The experimental
results demonstrate significant reward performance improvements compared to
MARL without STL guidance, along with a remarkable increase in the overall
safety rate of the multi-agent systems.
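The sketch below shows how STL robustness values can be turned into rewards, using the standard space-robustness semantics for "always" and "eventually" over a finite horizon; the concrete specifications, signals, and weights are illustrative assumptions rather than the paper's specifications.

```python
import numpy as np

def rob_always(signal):      # G(phi): worst-case satisfaction over the horizon
    return np.min(signal)

def rob_eventually(signal):  # F(phi): best-case satisfaction over the horizon
    return np.max(signal)

# Example signals for one agent over a 50-step episode (placeholder data).
dist_to_neighbor = np.random.uniform(2.0, 10.0, size=50)
dist_to_goal = np.linspace(20.0, 0.5, num=50)

safety_margin = 3.0          # safety spec: always keep at least 3 m separation
goal_tolerance = 1.0         # task spec: eventually get within 1 m of the goal

rho_safety = rob_always(dist_to_neighbor - safety_margin)
rho_task = rob_eventually(goal_tolerance - dist_to_goal)

# One simple choice: use the weighted robustness values directly as the reward.
reward = 1.0 * rho_task + 2.0 * rho_safety
print(reward)
```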
LawBench: Benchmarking Legal Knowledge of Large Language Models
Large language models (LLMs) have demonstrated strong capabilities in various
aspects. However, when applying them to the highly specialized, safety-critical
legal domain, it is unclear how much legal knowledge they possess and whether
they can reliably perform legal-related tasks. To address this gap, we propose
a comprehensive evaluation benchmark, LawBench. LawBench has been meticulously
crafted to provide a precise assessment of LLMs' legal capabilities from three
cognitive levels: (1) Legal knowledge memorization: whether LLMs can memorize
needed legal concepts, articles and facts; (2) Legal knowledge understanding:
whether LLMs can comprehend entities, events and relationships within legal
text; (3) Legal knowledge applying: whether LLMs can properly utilize their
legal knowledge and make necessary reasoning steps to solve realistic legal
tasks. LawBench contains 20 diverse tasks covering 5 task types: single-label
classification (SLC), multi-label classification (MLC), regression, extraction
and generation. We perform extensive evaluations of 51 LLMs on LawBench,
including 20 multilingual LLMs, 22 Chinese-oriented LLMs, and 9 legal-specific
LLMs. The results show that GPT-4 remains the best-performing LLM in the legal
domain, surpassing the others by a significant margin. While fine-tuning LLMs
on legal-specific text brings certain improvements, we are still a long way
from obtaining usable and reliable LLMs in legal tasks. All data, model
predictions and evaluation code are released in
https://github.com/open-compass/LawBench/. We hope this benchmark provides
an in-depth understanding of LLMs' domain-specific capabilities and speeds up
the development of LLMs in the legal domain.
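For a sense of how such tasks are typically scored, the sketch below computes accuracy for single-label classification and micro-F1 for multi-label classification; the metric choices and data layout are generic assumptions, and LawBench's own evaluation code in the linked repository may differ.

```python
def slc_accuracy(gold, pred):
    """Exact-match accuracy for single-label classification (SLC) tasks."""
    pairs = list(zip(gold, pred))
    return sum(g == p for g, p in pairs) / len(pairs)

def mlc_micro_f1(gold, pred):
    """Micro-averaged F1 for multi-label classification (MLC) tasks;
    gold and pred are sequences of label sets."""
    gold, pred = list(gold), list(pred)
    tp = sum(len(g & p) for g, p in zip(gold, pred))
    fp = sum(len(p - g) for g, p in zip(gold, pred))
    fn = sum(len(g - p) for g, p in zip(gold, pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return (2 * precision * recall / (precision + recall)
            if precision + recall else 0.0)

print(slc_accuracy(["theft", "fraud"], ["theft", "theft"]))   # 0.5
print(mlc_micro_f1([{"a", "b"}], [{"b", "c"}]))               # 0.5
```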
Interaction between O-GlcNAc Modification and Tyrosine Phosphorylation of Prohibitin: Implication for a Novel Binary Switch
Prohibitin (PHB or PHB1) is an evolutionarily conserved, multifunctional protein that is present in various cellular compartments, including the plasma membrane. However, the mechanisms involved in the various functions of PHB have not been fully explored. Here we report for the first time that PHB interacts with O-linked β-N-acetylglucosamine transferase (O-GlcNAc transferase, OGT), is O-GlcNAc modified, and also undergoes tyrosine phosphorylation in response to insulin. Tyrosine 114 (Tyr114) and tyrosine 259 (Tyr259) in PHB are in close proximity to the potential O-GlcNAc sites serine 121 (Ser121) and threonine 258 (Thr258), respectively. Substitution of the Tyr114 and Tyr259 residues in PHB with phenylalanine by site-directed mutagenesis results in reduced tyrosine phosphorylation as well as reduced O-GlcNAc modification of PHB. Surprisingly, it also results in enhanced tyrosine phosphorylation and activity of OGT, which is attributed to the presence of similar tyrosine motifs in PHB and OGT. Substitution of Ser121 and Thr258 with alanine and isoleucine, respectively, results in attenuation of O-GlcNAc modification and increased tyrosine phosphorylation of PHB, suggesting an association between these two dynamic modifications. Sequence analysis of O-GlcNAc-modified proteins with known O-GlcNAc modification site(s) or known tyrosine phosphorylation site(s) revealed a strong potential association between these two posttranslational modifications in various proteins. We speculate that O-GlcNAc modification and tyrosine phosphorylation of PHB play an important role in tyrosine kinase signaling pathways, including insulin, growth factor, and immune receptor signaling. In addition, we propose that O-GlcNAc modification and tyrosine phosphorylation constitute a novel, previously unidentified binary switch that may provide new mechanistic insights into cell signaling pathways and is open to direct experimental examination.
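A hypothetical sketch of the kind of site-proximity analysis described above is given below: given annotated tyrosine-phosphorylation sites and O-GlcNAc (Ser/Thr) sites on a protein, it lists site pairs that fall within a small residue window. Only the PHB site positions come from the abstract; the window size and pairing rule are illustrative assumptions.

```python
def close_pairs(phospho_sites, oglcnac_sites, window=10):
    """Return (tyrosine, Ser/Thr) position pairs within `window` residues of each other."""
    return [(y, st) for y in phospho_sites for st in oglcnac_sites
            if abs(y - st) <= window]

phb_tyr = [114, 259]        # Tyr114, Tyr259 (from the abstract)
phb_ser_thr = [121, 258]    # Ser121, Thr258 (from the abstract)
print(close_pairs(phb_tyr, phb_ser_thr))   # [(114, 121), (259, 258)]
```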
SYNTHESIS OF TETRAZINE-BASED COVALENT ORGANIC NETWORKS
After a comparison of inorganic, hybrid, and organic porous materials, a tetrazine-based organic porous material was chosen as the target material, since it is a rigid material providing micro- to mesoscale pores and allows post-synthetic modification through the inverse electron demand Diels-Alder reaction with different dienophiles. Such networks have potential as catalysts after modification with different metal chelation sites. Here, three strategies were developed to synthesize such materials. During the construction of building blocks through the double Diels-Alder reaction, the reactivities of tetrazines and dienophiles were studied. The coupling study established which type of coupling reaction could be used as well as how it proceeded. The direct synthesis of tetrazine-based networks through the formation of tetrazines then broadened the variety of materials that could be synthesized.
M.S. in Chemistry, May 201